Multivariate Bayesian Kernel Regression Model for High Dimensional Data and its Practical Applications in Near Infrared (NIR) Spectroscopy
Author
Abstract
Non-linear regression based on reproducing kernel Hilbert space (RKHS) has recently become very popular for fitting high-dimensional data. The RKHS formulation provides an automatic dimension reduction of the covariates, which is particularly helpful when the number of covariates ($p$) far exceeds the number of data points. In this paper, we introduce a Bayesian nonlinear multivariate regression model for high-dimensional problems. Our model is suitable when multiple correlated responses are observed for the same set of covariates. We introduce a robust Bayesian support vector regression model based on a multivariate version of Vapnik's $\epsilon$-insensitive loss function. The likelihood corresponding to the multivariate Vapnik's $\epsilon$-insensitive loss function is constructed as a scale mixture of truncated normal and gamma distributions. The regression function is constructed using the finite representation of a function in the RKHS. The kernel parameter is estimated adaptively by assigning a prior to it and using Markov chain Monte Carlo (MCMC) techniques for computation. We demonstrate the model through applications in near-infrared (NIR) spectroscopy and through simulation studies. Our Bayesian non-linear models are highly accurate in predicting the composition of materials from their NIR spectroscopy signatures. We compare our method with methodologies widely used in NIR spectroscopy: partial least squares (PLS), principal component regression (PCR), support vector machine (SVM), and random forest (RF). In all the simulation and real case studies, our multivariate Bayesian RKHS regression model outperforms the standard methods by a substantial margin. The MCMC-based implementation of our models is fairly fast and straightforward.
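To make two ingredients of the abstract concrete, the following is a minimal sketch, not the paper's implementation. It shows the finite RKHS representation $f(x) = \beta_0 + \sum_j \beta_j K(x, x_j)$ and the univariate form of Vapnik's $\epsilon$-insensitive loss. The Gaussian kernel, the kernel parameter name `theta`, the function names, and the toy numbers are all assumptions made for illustration; the multivariate loss and the Bayesian prior and MCMC steps are not reproduced here.

```python
import math

def gaussian_kernel(x, z, theta):
    """Gaussian kernel K(x, z) = exp(-||x - z||^2 / theta) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / theta)

def rkhs_predict(x_new, X_train, beta0, beta, theta):
    """Finite RKHS representation: f(x) = beta0 + sum_j beta_j * K(x, x_j)."""
    return beta0 + sum(
        b * gaussian_kernel(x_new, x_j, theta) for b, x_j in zip(beta, X_train)
    )

def eps_insensitive_loss(y, f, eps):
    """Univariate Vapnik loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y - f) - eps)

# Toy data (hypothetical): n = 3 samples, p = 5 covariates, so p > n.
X_train = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.5, 0.4, 0.3, 0.2, 0.1],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
beta0 = 0.2
beta = [1.0, -0.5, 0.25]  # one coefficient per training point, not per covariate
theta = 2.0

f_hat = rkhs_predict([0.1, 0.2, 0.3, 0.4, 0.5], X_train, beta0, beta, theta)
loss = eps_insensitive_loss(1.0, f_hat, eps=0.1)
print(f_hat, loss)
```

The sketch has n + 1 regression coefficients regardless of $p$, which is the sense in which the RKHS formulation reduces dimension when $p$ far exceeds the number of data points.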
By Sounak Chakraborty, Department of Statistics, University of Missouri-Columbia, 209F Middlebush Hall, Columbia, Missouri 65211, U.S.A. e-mail: [email protected]
Similar resources
Prediction of Freshness Quality and Phosphate Residue of White Shrimp Products Using Near-Infrared Spectroscopy
Background: The manufacturing of frozen shrimp is an important industry for the economy of Thailand. The objective of this study was to use Near-Infrared (NIR) spectroscopy to determine the freshness quality, including Total Volatile Basic Nitrogen (TVB-N) and Water Holding Capacity (WHC) of white shrimp (whole and chopped shrimp) and phosphate residues of shrimp. Methods: Sixty white shrimp ...
Near-infrared calibration transfer based on spectral regression.
A calibration transfer method for near-infrared (NIR) spectra based on spectral regression is proposed. The spectral regression method can reveal low-dimensional manifold structure in high-dimensional spectroscopic data and is suitable for transferring NIR spectra between different instruments. A comparative study of the proposed method and piecewise direct standardization (PDS) for standardization on tw...
Bayesian kernel projections for classification of high dimensional data
A Bayesian multi-category kernel classification method is proposed. The hierarchical model is treated with a Bayesian inference procedure and the Gibbs sampler is implemented to find the posterior distributions of the parameters. The practical advantage of the full probabilistic model-based approach is that probability distributions of prediction can be obtained for new data points, which gives...
Determination of Protein and Moisture in Fishmeal by Near-Infrared Reflectance Spectroscopy and Multivariate Regression Based on Partial Least Squares
The potential of Near Infrared Reflectance Spectroscopy (NIRS) as a fast method to predict the Crude Protein (CP) and Moisture (M) content in fishmeal by scanning spectra between 1000 and 2500 nm using multivariate regression technique based on Partial Least Squares (PLS) was evaluated. The coefficient of determination in calibration (R2C) and Standard Error of Calibra...
Near-infrared spectroscopy and hyperspectral imaging: non-destructive analysis of biological materials.
Near-infrared (NIR) spectroscopy has come of age and is now prominent among major analytical technologies after the NIR region was discovered in 1800, revived and developed in the early 1950s and put into practice in the 1970s. Since its first use in the cereal industry, it has become the quality control method of choice for many more applications due to the advancement in instrumentation, comp...